Information and uncertainty are closely related and extensively studied concepts in
a number of scientific disciplines such as communication theory, probability theory,
and statistics. Increasing the information arguably reduces the uncertainty on a
given random subject. Consider the uncertainty measure as the variance of a random
variable. Given the information that its outcome is in an interval, the uncertainty is
expected to reduce when the interval shrinks. This proposition is not generally true.
In this paper, we provide a necessary and sufficient condition for this proposition
when the random variable is absolutely continuous or integer valued. We also give a
similar result on Shannon information.

1 Introduction

Information and uncertainty and their relationship are familiar notions in daily life, but 
quantifying them is not easy. Probability theory provides a platform for the study of random 
objects but does not provide a universal index. Harris (1982) proposes the (relative) entropy 
of a probability distribution as such an index. The notion of entropy goes back at least 
as far as Shannon (1948) when uncertainty and information were seen as identical: the 
quantitative uncertainty $U(X)$ about a randomly distributed object $X$ was thought of as
the amount of information that observing $X$ would provide, since then all uncertainty about
it would vanish.

If $X$ is a random variable or a random vector, then in statistical science its variance
or covariance matrix $\mathrm{Var}(X)$ is often regarded as an uncertainty measure. The variance
is simple and popular, and it is routinely used to index the uncertainty in estimators. For
its viability as an uncertainty measure, it is natural to ask whether increased knowledge
about $X$ reduces the uncertainty as measured in terms of the variance. In this vein, Zidek
and van Eeden (2003) and Chen, van Eeden, and Zidek (2010) show that the conditional
variance $\mathrm{Var}(X \mid |X| < x)$ is an increasing function of $x$ when $X$ has a normal distribution,
regardless of its mean and variance. In fact, this result may be regarded as a direct
consequence of earlier results, as will be detailed. Yet it is easy to find counterexamples
where this conditional variance of $X$ does not increase with $x$. The size of the family of
distributions for which this monotonicity holds remained an unsolved problem.

In the search for an answer, the results in Burdett (1996) provide additional insight. For
any given $X$, denote the conditional mean $\mu(x) = E(X \mid X < x)$ and the conditional variance
$\sigma^2(x) = \mathrm{Var}(X \mid X < x)$. Both $\mu(x)$ and $\sigma^2(x)$ play important roles in economics, actuarial
science, reliability theory, and many other disciplines. Burdett (1996) provides a necessary
and sufficient condition for $\sigma^2(x)$ to be a monotonic function when $X$ is absolutely
continuous with finite mean and variance. In particular, log-concavity of the density function or
of the cumulative distribution function of $X$ is sufficient, when the second moment of $X$
is finite. In a paper on dispersion orders, Mailhot (1987) shows that when the cumulative
distribution function of $X$ is log-concave, the conditional distribution of $X$ given $X < x$
has increasing dispersion order in $x$. This order implies monotonicity of $\sigma^2(x)$, but this
result is obtained under a stronger condition than that of Burdett (1996). Mailhot (1987)
also contains a result that covers the normal result of Chen, van Eeden, and Zidek (2010).
There are undoubtedly more papers that contain results implying the monotonicity of $\sigma^2(x)$
under various conditions.

In this paper, we advance the monotonicity of the uncertainty and information measures
on several fronts. In one respect, we improve on Mailhot (1987), Burdett (1996), and Chen,
van Eeden, and Zidek (2010) by establishing a partial order on the conditional variance.
Let $\sigma^2(A) = \mathrm{Var}(X \mid X \in A)$ for any measurable set $A$. We give a necessary and sufficient
condition on $X$ with an absolutely continuous distribution under which $\sigma^2(A) \le \sigma^2(B)$ for any
two intervals $A \subset B$. We say that $\sigma^2(A)$ with this property is partially monotonic. When
$A$ is a finite interval, the conditional variance is always well defined. Hence, this result is
very general. For some distributions such as Cauchy, we establish partial monotonicity for
special interval classes of $A$. This result is particularly interesting because the variance of
the Cauchy distribution does not exist. This result is presented in Section 2. The result is
further applied to integer-valued random variables.

A scientific proposition stands only if the result can be repeated independently. In this
view, the uncertainty in $X$ may be measured by the difference in the outcomes from two
independently conducted experiments under identical conditions. Let $X_1$ and $X_2$ be two
such outcomes. The best uncertainty measure might be a function of $X_1 - X_2$. Additional
information in the form of $X_1, X_2 \in A$ should reduce the uncertainty in general. Let
$\varphi(u)$ be any increasing function in $|u|$. We show that a sufficient condition for
$E\{\varphi(X_1 - X_2) \mid X_1, X_2 \in A\}$ to be partially monotonic is that the density function of $X$ is log-concave.
Based on this result, we further show that the conditional Shannon information of $X_1 - X_2$
is partially monotonic under the same condition. Hence, Shannon information based on
$X_1 - X_2$ is another sensible information measure. We present these results in Section 3.
The paper ends with a short discussion in Section 4.

2 Partial monotonicity of the conditional variance 

Let $X$ be an absolutely continuous random variable. Denote its cumulative distribution
function by $F(x)$ and its density function by $f(x)$. We first investigate the conditions under
which $\mathrm{Var}(X \mid 0 < X < b)$ is an increasing function of $b$. Based on this result, we give a
necessary and sufficient condition for the partial monotonicity of $\mathrm{Var}(X \mid X \in A)$.

For a fixed value of $b > 0$, and for $x \in [0, b]$, define
\[ F_1(x) = \int_0^x \{F(t) - F(0)\}\,dt \quad\text{and}\quad F_2(x) = \int_0^x F_1(t)\,dt. \]
Note that $F_1(x)$ is a specific antiderivative of $F(x)$, and $F_2(x)$ is a specific antiderivative of
$F_1(x)$.

Without loss of generality, we assume $F(0) = 0$ and $F(b) > 0$ in the following derivations.
Applying the technique of integration by parts, the conditional mean is
\[ E\{X \mid 0 < X < b\} = \int_0^b x\,dF(x)/F(b) = b - F_1(b)/F(b). \]
Similarly, we find
\[ E\{X^2 \mid 0 < X < b\} = \int_0^b x^2\,dF(x)/F(b) = b^2 - 2\int_0^b x\,dF_1(x)/F(b) = b^2 - 2bF_1(b)/F(b) + 2F_2(b)/F(b). \]
Consequently, we have the following expression for the conditional variance:
\[ \mathrm{Var}\{X \mid 0 < X < b\} = \frac{2F_2(b)}{F(b)} - \frac{F_1^2(b)}{F^2(b)}. \]
Its derivative with respect to $b$ is given by
\[ \frac{2f(b)}{F^3(b)}\{F_1^2(b) - F(b)F_2(b)\}. \]
Hence, $\mathrm{Var}\{X \mid 0 < X < b\}$ is an increasing function of $b$ if and only if $F_1^2(b) - F(b)F_2(b) \ge 0$
for all $b > 0$. This is equivalent to $F_2(b)$ being log-concave. The above proof has closely
followed that of Burdett (1996).

We now summarize the above derivation by a theorem in which $F(0) = 0$ is no longer
assumed.

THEOREM 1 Let $X$ be an absolutely continuous random variable with cumulative
distribution function $F(x)$. The conditional variance $\mathrm{Var}\{X \mid 0 < X < b\}$ is an increasing
function of $b$ if and only if
\[ \int\!\!\int_{0<x<y<b} \{F(x) - F(0)\}\,dx\,dy \]
is log-concave.
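As a numerical illustration (not part of the original derivation), the following Python sketch evaluates the variance expression $2F_2(b)/F(b) - F_1^2(b)/F^2(b)$ obtained above, under the assumption $F(0) = 0$, and compares it with the closed-form truncated variance of a unit exponential variable. The function names `cond_var_via_F2` and `exp_cond_var`, the trapezoid rule, and the grid of $b$ values are choices made here for illustration only.

```python
import math

def cond_var_via_F2(F, b, n=20000):
    """Var(X | 0 < X < b) = 2*F2(b)/F(b) - (F1(b)/F(b))**2, assuming F(0) = 0.

    F1 and F2 are accumulated with the trapezoid rule on a uniform grid.
    """
    h = b / n
    F1 = 0.0  # running value of F1(x) = integral of F over [0, x]
    F2 = 0.0  # running value of F2(x) = integral of F1 over [0, x]
    for i in range(n):
        x = i * h
        step = 0.5 * h * (F(x) + F(x + h))  # increment of F1 on [x, x + h]
        F2 += h * (F1 + 0.5 * step)         # trapezoid rule for F1 on [x, x + h]
        F1 += step
    Fb = F(b)
    return 2.0 * F2 / Fb - (F1 / Fb) ** 2

def exp_cond_var(b):
    """Closed-form Var(X | 0 < X < b) for X ~ Exponential(1)."""
    r = math.exp(-b) / (1.0 - math.exp(-b))
    mean = 1.0 - b * r
    second = 2.0 - (b * b + 2.0 * b) * r
    return second - mean ** 2

exp_cdf = lambda x: 1.0 - math.exp(-x)   # log-concave cdf with F(0) = 0
grid = [0.5, 1.0, 2.0, 4.0, 8.0]
values = [cond_var_via_F2(exp_cdf, b) for b in grid]
```

Since the exponential density is log-concave, the entries of `values` are expected to increase along `grid`, in line with Theorem 1.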

The above theorem easily generalizes from conditioning on $0 < X < b$ to conditioning
on $a < X < b$. The generalization leads to partial monotonicity of $\mathrm{Var}(X \mid X \in A)$. The
proof of the next theorem is straightforward and omitted.

THEOREM 2 Let $X$ be an absolutely continuous random variable with cumulative
distribution function $F(x)$. The conditional variance $\mathrm{Var}\{X \mid a < X < b\}$ is increasing in $b$ if
and only if
\[ \int\!\!\int_{a<x<y<b} \{F(x) - F(a)\}\,dx\,dy \tag{1} \]
is log-concave in $b$, and it is decreasing in $a$ if and only if
\[ \int\!\!\int_{a<y<x<b} \{F(b) - F(x)\}\,dx\,dy \tag{2} \]
is log-concave in $a$.

When both conditions are satisfied for all $a, b \in C$ for some convex set $C$, then
$\mathrm{Var}\{X \mid X \in A\}$ is partially monotonic in interval $A$ such that $A \subset C$.

Many closely related results have been established in the literature. Under nearly
identical conditions, Burdett (1996) establishes the monotonicity of the conditional variance when
$A$ takes the form $(-\infty, b]$. Theorem 2 is more general because it is applicable to situations
where the mean and variance of $X$ do not exist, and to both finite and infinite intervals.

Let $F(x)$ be a cumulative distribution function and let $F^{-1}(\alpha) = \inf\{x : F(x) \ge \alpha\}$ for
$0 < \alpha < 1$. $F$ has higher dispersion order than $G$ if for any $0 < \alpha < \beta < 1$,
\[ F^{-1}(\beta) - F^{-1}(\alpha) \ge G^{-1}(\beta) - G^{-1}(\alpha). \]
When $F(x)$ is log-concave, Mailhot (1987) shows that the conditional distribution of $X$ given
$X < a$ has increased dispersion order in $a$ and hence increased conditional variance. Hence,
the result of Mailhot (1987) implies the result of Burdett (1996), though the latter provides a
necessary and sufficient condition. Mailhot (1987) further shows that if the density function
$f$ is log-concave, the conditional distribution of $X$ given $a < X < b$ decreases in $a$ and
increases in $b$ in dispersion order. This result implies the partial monotonicity presented in
Theorem 2 and completely covers the normal result of Chen, van Eeden, and Zidek (2010).
Theorem 2, however, succeeds at giving a necessary and sufficient condition. Theorem 2
might be useful for establishing a dispersion order more broadly in reverse.

It is natural to examine under what distributions Conditions (1) and (2) are satisfied.
We now give a sufficient condition and a few examples. The notion of log-concavity plays
a key role. Log-concavity has been studied thoroughly in mathematics. The following is a
well-known fact; for its proof, see Bagnoli and Bergstrom (2005).

LEMMA 1 If a function $f(x)$ is log-concave for $x \in (a, b)$, then the following antiderivative
\[ F(x) = \int_a^x f(t)\,dt \]
is also log-concave for $x \in (a, b)$ whenever it is well defined.

Note that $a = -\infty$ and/or $b = \infty$ are special cases. According to this lemma, if a
density function is log-concave, so is its cumulative distribution function. Clearly, if $F(x)$
is log-concave, so is $F(\alpha x + \beta)$ for any real numbers $\alpha$ and $\beta$ in the corresponding interval
for $x$. Another particularly useful result is as follows:

LEMMA 2 If a cumulative distribution function $F(x)$ is log-concave on interval $C = (a, b)$,
then $F(x) - F(x_0)$ is also log-concave for $\max(a, x_0) < x < b$. Similarly, $F(x_0) - F(x)$ is
log-concave on $a < x < \min(x_0, b)$.

Proof: Note that
\[ \frac{d\log\{F(x) - F(x_0)\}}{dx} = \frac{f(x)}{F(x) - F(x_0)} = \frac{f(x)}{F(x)}\left[1 - \frac{F(x_0)}{F(x)}\right]^{-1}. \]
Because $F(x)$ is log-concave over $C$, $f(x)/F(x)$ is a decreasing function. At the same time,
$1 - F(x_0)/F(x)$ is an increasing function. Hence, $d\log\{F(x) - F(x_0)\}/dx$ is a decreasing
function in $\max(a, x_0) < x < b$. Consequently, $F(x) - F(x_0)$ is log-concave.

The proof of the second conclusion is the same. ■

THEOREM 3 Let $X$ be a random variable with cumulative distribution function $F(x)$. If
$F(x)$ is log-concave on interval $C$, then $\mathrm{Var}\{X \mid X \in A\} \le \mathrm{Var}\{X \mid X \in B\}$ for any intervals
$A \subset B \subset C$.

Proof: Since $F(x)$ is log-concave for $x \in C$, by Lemma 2 so is $F(x) - F(a)$ for $x > a$ and
$a \in C$. Applying Lemma 1 twice, we find that
\[ \int\!\!\int_{a<x<y<b} \{F(x) - F(a)\}\,dx\,dy \]
is log-concave in $b$ over $b \in C$ for any given $a \in C$. That is, Condition (1) is satisfied.
Similarly, Condition (2) is also satisfied. The result then follows from Theorem 2. ■

Theorem 3 is almost a special case of Mailhot (1987) except for a difference in conditions: 
Mailhot (1987) requires the density function $f(x)$ rather than the cumulative distribution
function $F(x)$ to be log-concave. As pointed out after Lemma 1, when the density function $f(x)$
is log-concave so is $F(x)$. A large number of well-known distributions have a log-concave
density function. In particular, the normal density with any mean and variance is
log-concave. Hence, the normal result in Chen, van Eeden, and Zidek (2010) is a special case
of Mailhot (1987), and also of Theorem 3.

Many commonly used distributions have log-concave density or log-concave cumulative
distribution functions. We selectively point out that normal, logistic, double exponential,
Weibull ($c x^{c-1}\exp(-x^c)$) and Gamma ($x^{c-1}\exp(-x)$) with $c \ge 1$ have log-concave densities.
Log-normal, Weibull, and Gamma with $0 < c < 1$ have log-concave cumulative distribution
functions; see Bagnoli and Bergstrom (2005) for a more complete list. In short, the sufficient
condition of the above theorem is broadly applicable.
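Log-concavity of a given density or cumulative distribution function can be screened numerically before Theorem 3 is invoked. The sketch below is an illustration added here, not a procedure from the paper: it tests the midpoint inequality $\log h(x - \delta) + \log h(x + \delta) \le 2\log h(x)$ on a uniform grid. The helper name `is_log_concave`, the grid size, and the tolerance are assumptions, and a grid check can only suggest log-concavity, not prove it.

```python
import math

def is_log_concave(fn, lo, hi, n=2000, tol=1e-9):
    """Grid check of log fn(x-h) + log fn(x+h) <= 2 log fn(x) on [lo, hi].

    fn must be strictly positive on [lo, hi]; tol absorbs rounding error.
    """
    h = (hi - lo) / n
    logs = [math.log(fn(lo + i * h)) for i in range(n + 1)]
    return all(logs[i - 1] + logs[i + 1] - 2.0 * logs[i] <= tol
               for i in range(1, n))

normal_pdf = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
cauchy_pdf = lambda x: 1.0 / (math.pi * (1.0 + x * x))
cauchy_cdf = lambda x: 0.5 + math.atan(x) / math.pi

checks = {
    "normal density on [-5, 5]": is_log_concave(normal_pdf, -5.0, 5.0),
    "Cauchy density on [-5, 5]": is_log_concave(cauchy_pdf, -5.0, 5.0),
    "Cauchy cdf on [0, 50]": is_log_concave(cauchy_cdf, 0.0, 50.0),
    "Cauchy cdf on [-50, 0]": is_log_concave(cauchy_cdf, -50.0, 0.0),
}
```

The normal density passes the check, while the Cauchy density fails it; the Cauchy cumulative distribution function passes on the positive half-line only.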

2.1 Cauchy distribution and symmetric distributions 

Let $X$ be a random variable with a standard Cauchy distribution. Its cumulative
distribution function $F(x) = 1/2 + \arctan(x)/\pi$ is log-concave in $C = [0, \infty)$. Hence,
\[ \mathrm{Var}\{X \mid X \in A\} \le \mathrm{Var}\{X \mid X \in B\} \]
for any finite intervals $A \subset B \subset [0, \infty)$. This is a particularly interesting example because
the variance of the Cauchy distribution does not exist, and neither its density function nor its
cumulative distribution function is log-concave on the whole real line. Numerical investigations
indicate that Condition (1) is not satisfied by the Cauchy distribution. Hence, $\mathrm{Var}\{X \mid X \in A\}$ is not
partially monotonic in general.

Suppose $X$ is a positive random variable with decreasing density function over $[0, \infty)$.
Then its cumulative distribution function is easily verified to be log-concave. Hence,
$\mathrm{Var}\{X \mid X \in A\}$ is partially monotonic on $A \subset [0, \infty)$. In particular, $\mathrm{Var}(X \mid 0 < X < b)$ is
an increasing function of $b$.

Let $X$ be a symmetrically and absolutely continuously distributed random variable. Let
$f_a(x)$ be the conditional density function of $|X|$ given $|X| \le a$, and write $X_a$ for a random
variable distributed as $X$ given $|X| \le a$. For any $0 < a < b$,
\[ \frac{f_a(x)}{f_b(x)} = \begin{cases} P(|X| \le b)/P(|X| \le a), & 0 < x < a; \\ 0, & a < x < b. \end{cases} \]
This implies $|X_a|$ is smaller than $|X_b|$ in the likelihood ratio order and hence also in
stochastic order. See Theorem 1.C.1 in Shaked and Shanthikumar (2007, p. 43). Consequently,
$E\varphi(|X_a|) \le E\varphi(|X_b|)$ for any increasing function $\varphi(\cdot)$ on $[0, \infty)$, and $\mathrm{Var}(X_a) \le \mathrm{Var}(X_b)$
is a special case. The t-distribution, with Cauchy as a special case, is an example.
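The Cauchy example of this subsection can be explored with closed-form truncated moments. The sketch below is an added illustration: it uses the elementary antiderivatives $\int x/(1+x^2)\,dx = \tfrac12\log(1+x^2)$ and $\int x^2/(1+x^2)\,dx = x - \arctan(x)$. The function name `cauchy_cond_var` and the particular intervals are choices made here, not taken from the paper.

```python
import math

def cauchy_cond_var(a, b):
    """Var(X | a < X < b) for a standard Cauchy X, from closed-form moments.

    The factor 1/pi of the density cancels between numerator and denominator.
    """
    mass = math.atan(b) - math.atan(a)                       # pi * P(a < X < b)
    mean = 0.5 * (math.log1p(b * b) - math.log1p(a * a)) / mass
    second = (b - a) / mass - 1.0                            # E(X^2 | a < X < b)
    return second - mean ** 2

# Nested intervals inside [0, inf): the smaller interval has the smaller variance.
inner, outer = cauchy_cond_var(1.0, 3.0), cauchy_cond_var(0.0, 5.0)

# Growing the right end point from a = 0.
from_zero = [cauchy_cond_var(0.0, b) for b in (1.0, 2.0, 5.0, 10.0, 100.0)]

# Nested intervals on the negative half-line: the ordering can reverse.
small_neg, large_neg = cauchy_cond_var(-1000.0, -10.0), cauchy_cond_var(-1000.0, 0.0)
```

In this sketch `inner <= outer` and `from_zero` is increasing, whereas `small_neg > large_neg` even though $(-1000, -10) \subset (-1000, 0)$, which is consistent with the statement that partial monotonicity fails for the Cauchy distribution in general.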
2.2 Discrete distributions



Can Theorem 3 be generalized to discrete distributions? We study this problem for
integer-valued random variables. Since integration in a discrete space becomes summation, there
is hope that the approach for continuous random variables is still applicable. While the
same approach can be used, we find that the conditions corresponding to (1) and (2) are
too complex to be insightful. However, a simple albeit less general sufficient condition can
be obtained.

Let $X$ be an integer-valued random variable and $p(x)$ its probability mass function.
That is, $p(x) = P(X = x)$ for all integers $x$. The cumulative distribution function of $X$
is then $G(x) = \sum_{k \le x} p(k)$. We define an absolutely continuous random variable $Y$ so
that its density function is
\[ f(y) = \sum_k p(k)\,I(k - 0.5 < y \le k + 0.5). \tag{3} \]
In other words, $Y$ is uniform on each interval $(k - 0.5, k + 0.5]$ provided $p(k) > 0$. For any
integers $a < b$, it is easily verified that
\[ E\{X \mid a \le X \le b\} = E\{Y \mid a - 0.5 < Y < b + 0.5\} \]
and that
\[ E\{X^2 \mid a \le X \le b\} = E\{Y^2 \mid a - 0.5 < Y < b + 0.5\} - 1/12. \]
Consequently,
\[ \mathrm{Var}\{X \mid a \le X \le b\} = \mathrm{Var}\{Y \mid a - 0.5 < Y < b + 0.5\} - 1/12. \tag{4} \]
We hence have the following lemma.

LEMMA 3 Let $X$ and $Y$ be the two random variables defined earlier. A sufficient condition
for the partial monotonicity of $\mathrm{Var}(X \mid X \in A)$ is that the cumulative distribution function
of $Y$ satisfies Conditions (1) and (2).

Proof: If the cumulative distribution function of $Y$ satisfies Conditions (1) and (2), then
by Theorem 2 the conditional variance $\mathrm{Var}(Y \mid Y \in A)$ is partially monotonic. Hence, for
any finite interval $A$, $\mathrm{Var}\{X \mid X \in A\}$ is also partially monotonic by (4). This completes
the proof. ■
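Identity (4), on which the proof rests, can be checked numerically. The sketch below is an added illustration under stated assumptions: $X$ is taken to be Poisson, the variance of $X$ is computed by direct summation, and the variance of $Y$ is computed by a midpoint rule applied to the step density (3). The function names and the number of sub-intervals `m` are hypothetical choices.

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu ** k / math.factorial(k)

def var_X(a, b, mu):
    """Var(X | a <= X <= b) for X ~ Poisson(mu), by direct summation."""
    ks = range(a, b + 1)
    w = [poisson_pmf(k, mu) for k in ks]
    tot = sum(w)
    m1 = sum(k * p for k, p in zip(ks, w)) / tot
    m2 = sum(k * k * p for k, p in zip(ks, w)) / tot
    return m2 - m1 ** 2

def var_Y(a, b, mu, m=2000):
    """Var(Y | a - 0.5 < Y < b + 0.5) for the step density (3), midpoint rule."""
    h = 1.0 / m
    tot = m1 = m2 = 0.0
    for k in range(a, b + 1):
        p = poisson_pmf(k, mu)          # density height on (k - 0.5, k + 0.5]
        for j in range(m):
            y = k - 0.5 + (j + 0.5) * h  # midpoint of the j-th sub-interval
            tot += p * h
            m1 += y * p * h
            m2 += y * y * p * h
    m1 /= tot
    m2 /= tot
    return m2 - m1 ** 2

gap = var_Y(2, 7, 3.0) - var_X(2, 7, 3.0)   # expected to be close to 1/12
```

The difference `gap` agrees with $1/12$ up to the discretization error of the midpoint rule, as (4) predicts.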
It is likely that this condition is also necessary. Yet the condition for this seemingly neat
result is hard to verify so we do not explore further in this direction. Instead, we strive to
find a few simple-to-verify sufficient conditions.

If the probability function of $X$ is unimodal, then its corresponding $Y$ has a monotonic
density function on either side of the mode. Hence, $\mathrm{Var}\{X \mid X \in A\}$ is partially monotonic
on either side of the mode. We have two specific examples for the purpose of illustration.

Suppose $X$ has a geometric distribution. Since its probability mass function is a
decreasing function, its conditional variance $\mathrm{Var}\{X \mid X \in A\}$ is partially monotonic.

Suppose $X$ has a Poisson distribution with mean $\mu$. Then its probability mass function
is monotonic for $x \ge \mu$ and for $x \le \mu$. Therefore, $\mathrm{Var}\{X \mid \mu \le X \le b\}$ is an increasing
function of $b$ and $\mathrm{Var}\{X \mid a \le X \le \mu\}$ is a decreasing function of $a$.
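The Poisson example can be tabulated directly. The following sketch is an added illustration with an arbitrarily chosen mean $\mu = 3.5$, so that the integers to the right of the mean start at 4 and those to the left end at 3; the helper names are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu ** k / math.factorial(k)

def cond_var(a, b, mu):
    """Var(X | a <= X <= b) for X ~ Poisson(mu), by direct summation."""
    ks = range(a, b + 1)
    w = [poisson_pmf(k, mu) for k in ks]
    tot = sum(w)
    m1 = sum(k * p for k, p in zip(ks, w)) / tot
    m2 = sum(k * k * p for k, p in zip(ks, w)) / tot
    return m2 - m1 ** 2

mu = 3.5                                              # assumed mean, for illustration
right = [cond_var(4, b, mu) for b in range(4, 16)]    # mu <= X <= b, b = 4, ..., 15
left = [cond_var(a, 3, mu) for a in range(0, 4)]      # a <= X <= mu, a = 0, ..., 3
```

Here `right` is non-decreasing in $b$ and `left` is non-increasing in $a$, matching the two monotonicity statements for the Poisson distribution.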

3 Partial monotonicity of other measures of uncertainty or information

In this section, we assume that $X$ is a random variable with a density function $f(x)$ that
is log-concave and differentiable. Let $X_1$ and $X_2$ be two independent and identically
distributed copies of $X$. Let $A$ be the event $0 < X_1, X_2 < b$ and denote $F(x) = P(0 < X < x)$.
Denote by $g(u; b)$ the marginal density function of $U = X_1 - X_2$ given $A$.

As discussed in the Introduction, the size of $U$ represents the repeatability of an
experimental result. Any increasing function of $|U|$ and its expectation serves as an index of
uncertainty. Thus, it is of interest to study the properties of $U$ under condition $A$. We will
show that the log-concavity provides many properties of $X_1 - X_2$, and partial monotonicity
for a number of uncertainty measures, in particular, the Shannon information.

LEMMA 4 The density function $g(u; b)$ is decreasing in $u$ for $u \in [0, b]$ for any $b > 0$.

Proof. According to Theorem 1.8 in Dharmadhikari and Joag-dev (1988, p. 15), if $X_1$
and $X_2$ are two independent random variables with the same unimodal distribution, then
$X_1 - X_2$ is also unimodal. Because a distribution with log-concave density is unimodal,
this result leads to the conclusion of this lemma. The result can also be easily and directly
verified by showing that the derivative of $g(u; b)$ with respect to $u$ is non-positive for $u > 0$.
■
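Lemma 4 can be illustrated numerically. The sketch below is added here and rests on an expression derived directly from the definitions of $U$ and $A$ rather than copied from the paper: for $0 \le u < b$, the conditional density of $U$ at $u$ equals $\int_0^{b-u} f(x)f(x+u)\,dx / F^2(b)$, and it is symmetric in $u$. The function name `diff_density`, the midpoint rule, and the Gamma density with shape 2 are assumptions chosen for illustration.

```python
import math

def diff_density(f, u, b, n=4000):
    """Density of U = X1 - X2 at u, given 0 < X1, X2 < b (midpoint rule).

    Uses g(u; b) = int_0^{b-|u|} f(x) f(x + |u|) dx / F(b)^2, derived from
    the definition of U and A for this sketch.
    """
    u = abs(u)
    if u >= b:
        return 0.0

    def integrate(fn, lo, hi):
        h = (hi - lo) / n
        return h * sum(fn(lo + (i + 0.5) * h) for i in range(n))

    mass = integrate(f, 0.0, b)                               # F(b) = P(0 < X < b)
    overlap = integrate(lambda x: f(x) * f(x + u), 0.0, b - u)
    return overlap / mass ** 2

gamma2_pdf = lambda x: x * math.exp(-x)    # Gamma density with shape 2, log-concave
us = [0.1 * i for i in range(30)]          # u = 0.0, 0.1, ..., 2.9
gs = [diff_density(gamma2_pdf, u, 3.0) for u in us]
```

The values in `gs` decrease as $u$ moves from 0 toward $b = 3$, as the lemma states for a log-concave density.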

A bivariate function $K(x, y)$ is totally positive of order 2 (TP2) if, for every choice of
points $(x_1, y_1)$, $(x_2, y_2)$ with $x_1 < x_2$ and $y_1 < y_2$, we have
\[ K(x_1, y_1)K(x_2, y_2) \ge K(x_1, y_2)K(x_2, y_1). \]
Let $I(x)$ be an indicator function that is 1 or 0 according to whether $x \ge 0$ or $x < 0$. It is
seen that $I(b - u)$ is totally positive. Because $f$ is a log-concave density, $f(u - x)I(u - x)$ is also
TP2 according to Dharmadhikari and Joag-dev (1988, p. 150). In addition, by composition
formula (2.4) in Karlin (1968, p. 16),
\[ g(u; b) = I(b - u)\int_0^\infty \{f(u - x)I(u - x)\}\,dF(x)/F^2(b) \]
is also TP2. The following lemma is a simple implication of the total positivity of $g(u; b)$.
For the sake of completeness, we include a quick proof. For notational simplicity, we write
$g'(u; b)$ for the partial derivative of $g$ with respect to $u$.

LEMMA 5 For $0 < b_1 < b_2$ and $0 < u < b_1$, we have
\[ g'(u; b_1)g(u; b_2) \le g(u; b_1)g'(u; b_2). \]
Proof. Because $g(u; b)$ is TP2, we have for any $\delta > 0$,
\[ g(u; b_1)g(u + \delta; b_2) \ge g(u; b_2)g(u + \delta; b_1). \]
This implies
\[ g(u; b_1)\{g(u + \delta; b_2) - g(u; b_2)\} \ge g(u; b_2)\{g(u + \delta; b_1) - g(u; b_1)\}. \]
Dividing both sides by $\delta$ and letting $\delta \to 0$, we get the result. ■

With this result, the following theorem follows immediately when coupled with the concept of the
likelihood ratio order introduced earlier.

THEOREM 4 For any function $\varphi(u)$ that increases in $|u|$, $E\{\varphi(U) \mid X_1, X_2 \in A\}$ is
partially monotonic.

Proof. Lemma 5 implies that $\log g(u; b_2) - \log g(u; b_1)$ is an increasing function of $u$ over
$u > 0$ for all $0 < b_1 < b_2$. Hence, $|U|$ has a lower likelihood ratio order given $0 < X_1, X_2 < b_1$
than given $0 < X_1, X_2 < b_2$. This implies the result. ■

A special example of $\varphi(\cdot)$ is of particular interest. When $\varphi(u) = u^2/2$ we have
$E\{\varphi(U) \mid A\} = \mathrm{Var}(X \mid X \in A)$. Hence, we have proved again that $\mathrm{Var}(X \mid X \in A)$ is
partially monotonic. It is also easily seen that $E\{|X_1 - X_2| \mid A\}$ is partially monotonic.
The quantity $E\{|X_1 - X_2|\}$ is known as a measure of concentration and its corresponding
U-statistic is called Gini's mean difference (Serfling, 1980).
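Both special cases of Theorem 4 can be illustrated with a crude two-dimensional quadrature. The sketch below is an added illustration, assuming a unit exponential density and intervals of the form $(0, b)$; the function name `cond_expect` and the grid size are hypothetical choices. It approximates $E\{\varphi(X_1 - X_2) \mid 0 < X_1, X_2 < b\}$ for $\varphi(u) = |u|$ and $\varphi(u) = u^2/2$.

```python
import math

def cond_expect(phi, f, b, n=300):
    """E{phi(X1 - X2) | 0 < X1, X2 < b} by a midpoint rule on an n-by-n grid."""
    h = b / n
    xs = [(i + 0.5) * h for i in range(n)]   # cell midpoints in (0, b)
    w = [f(x) for x in xs]                   # unnormalized weights
    tot = sum(w)
    acc = 0.0
    for x1, w1 in zip(xs, w):
        for x2, w2 in zip(xs, w):
            acc += phi(x1 - x2) * w1 * w2
    return acc / tot ** 2

exp_pdf = lambda x: math.exp(-x)             # log-concave density on (0, inf)
bs = [0.5, 1.0, 2.0, 4.0]
gini = [cond_expect(abs, exp_pdf, b) for b in bs]                    # E|X1 - X2|
half_sq = [cond_expect(lambda u: 0.5 * u * u, exp_pdf, b) for b in bs]  # Var(X | A)
```

Both lists increase with $b$, and `half_sq` reproduces the conditional variance of $X$ given $0 < X < b$ up to discretization error.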



3.1 Shannon information 

Let $\varphi(u) = -\log g(u; b_2)$ for some $b_2 > 0$. By Lemma 4, this choice of $\varphi(u)$ is an increasing
function of $|u|$. Thus, by Theorem 4, when $b_1 < b_2$ we have
\[ E\{-\log g(U; b_2) \mid 0 < X_1, X_2 < b_1\} \le E\{-\log g(U; b_2) \mid 0 < X_1, X_2 < b_2\}. \]
In other words,
\[ -\int g(u; b_1)\log g(u; b_2)\,du \le -\int g(u; b_2)\log g(u; b_2)\,du. \tag{5} \]
According to Jensen's inequality (Serfling, 1980, p. 351), for any convex function $\phi$ and
random variable $Y$, we have $E\phi(Y) \ge \phi(E(Y))$ when the expectations exist. Applying this
inequality to the convex function $-\log(\cdot)$ and random variable
\[ Y = g(U; b_2)/g(U; b_1) \]
such that $U$ has density function $g(u; b_1)$, we get
\[ -\int g(u; b_1)\log\frac{g(u; b_2)}{g(u; b_1)}\,du \ge -\log\int\frac{g(u; b_2)}{g(u; b_1)}\,g(u; b_1)\,du \ge 0 \]
because the integration of $g(u; b_2)$ over the support of $g(u; b_1)$ is no more than 1. Hence,
we get
\[ -\int g(u; b_1)\log g(u; b_1)\,du \le -\int g(u; b_1)\log g(u; b_2)\,du. \tag{6} \]
Combining (5) and (6) we get
\[ -\int g(u; b_1)\log g(u; b_1)\,du \le -\int g(u; b_2)\log g(u; b_2)\,du. \]
Note that
\[ E\{-\log g(U; b_1) \mid 0 < X_1, X_2 < b_1\} \]
is the Shannon information of $U$ given $0 < X_1, X_2 < b_1$. The inequality hence implies
that the conditional Shannon information of $U$ increases in $b$. More formally, we have the
following theorem without proof.

THEOREM 5 The Shannon information of $U$ given $A$ is a partially increasing function
of interval $A$.

This result matches our intuition well. Shannon information measures the amount of
uncertainty in a distribution. For random variables with log-concave density, the uncertainty
in the form of $X \in A$ increases when $A$ increases. This helps to reduce the uncertainty
measured by the conditional variance or to increase the information measured in terms of
the Shannon information.
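The monotonicity of the conditional Shannon information can also be examined numerically. The sketch below is an added illustration for intervals of the form $(0, b)$: it approximates the conditional density of $U$ on a grid by a discrete overlap sum, then integrates $-g\log g$ over $[-b, b]$ with the trapezoid rule. The function name `entropy_of_difference`, the grid size, and the unit exponential density are assumptions made for this example. For a uniform density on $(0, 1)$, $U$ is triangular on $[-1, 1]$ and the exact value is $1/2$, which serves as a sanity check.

```python
import math

def entropy_of_difference(f, b, n=400):
    """Shannon information -E log g(U; b) of U = X1 - X2 given 0 < X1, X2 < b."""
    h = b / n
    w = [f((i + 0.5) * h) for i in range(n)]   # f at cell midpoints of (0, b)
    mass = h * sum(w)                          # approximates P(0 < X < b)
    total = 0.0
    for k in range(n):                         # u = k * h; g vanishes at u = b
        g = h * sum(w[i] * w[i + k] for i in range(n - k)) / mass ** 2
        term = g * math.log(g)
        total += 0.5 * term if k == 0 else term   # trapezoid weights on [0, b]
    return -2.0 * h * total                    # g is symmetric about u = 0

exp_pdf = lambda x: math.exp(-x)               # log-concave density
ents = [entropy_of_difference(exp_pdf, b) for b in (0.5, 1.0, 2.0, 4.0)]
uniform_check = entropy_of_difference(lambda x: 1.0, 1.0)   # exact value is 0.5
```

The entries of `ents` increase with $b$, in agreement with Theorem 5.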

4 Discussion 

One cannot help but conjecture that similar results hold for multidimensional random
subjects. This would be an interesting investigation. Our discussion on integer-valued
random variables has been limited. In addition, there exist many other versions of
information/entropy (Harris, 1982). Our result on random variables with a log-concave density is
clearly applicable to most of them. We hope that our results will stimulate interest in these
areas.